I. INTRODUCTION

The network topologies on which many natural and 
synthetic systems are built provide ideal settings for the 
emergence of complex phenomena; see the reviews [IHS] . 
One well-studied manifestation of this, called a cascade or 
avalanche, is observed when under certain circumstances 
interactions between the components of a system allow an 
initially localized effect to propagate globally. For exam- 
ple, the malfunction of technological systems like email 
networks or electrical power grids is often attributable 
to a cascade of failures triggered by some isolated event. 
Similarly, the transmission of infectious diseases and the 
adoption of innovations or cultural fads may induce cas- 
cades amongst people in society. 

It has been extensively demonstrated [5HT7] that the 
dynamics of cascades depend sensitively on the patterns 
of interaction laid out by the underlying network. One of 
the goals of network theory is to provide a solid theoreti- 
cal basis for this dependence. In order to do this one must 
first construct network models which are both mathe- 
matically sound and which capture the salient features 
of their real-world counterparts. So far there has been 
limited success in this direction. Most existing analytical 
results derive from the class of random networks defined 
by the so-called configuration model [18, 19]. The degree
distribution of a network, $p_k$, specifies the fraction of its
nodes (vertices) that have $k$ incident edges. In the configuration
model, one generates a network of size $n$ with a given $p_k$ by
attaching, with appropriate probabilities, $k$ stubs to each of a
set of $n$ nodes, and then randomly connecting pairs of these
stubs together to make complete edges. The major shortcoming of
this approach is that in the limit $n \to \infty$ the density of
cycles of length three (triangles) in the resulting network
vanishes. In contrast,
it is well established that the presence of closed interac- 
tions in real-world networks engenders significant num- 
bers of these short cycles. This feature is usually quanti- 
fied using some version of the clustering coefficient, which 
has been described in a sociological context as the probability
that "the friend of my friend is also my friend".

Recently, Newman [21] and Miller independently
proposed an extension to the classical configuration 
model to include nonzero levels of clustering (even as
$n \to \infty$), thus opening the door to the derivation of
new analytical results for cascade dynamics on somewhat
more realistic network topologies. Newman's model
(which is the primary focus of our investigation) introduces
a joint distribution, $p_{st}$, specifying the fraction of
nodes that are each connected to $s$ single edges and $t$
triangles, thereby directly embedding triangles of interconnected
nodes into a locally tree-like structure. Since
the parameter $t$ controls the density of triangles it also
determines the clustering coefficient. In addition, it was
shown in [21] how the generating function formalism of
[23] can be applied to these networks to derive expressions
for some of their fundamental properties.

In this paper we demonstrate an analytical approach 
to determining the mean cascade size in a broad range of 
dynamical models on the clustered random networks of 
[21]. This approach extends the work of Gleeson and Cahalane
[24] and Gleeson [25] on locally tree-like networks
which itself was built on methods introduced to study the 
zero-temperature random-field Ising model on Bethe lat- 
tices [55H55] . We consider a specific class of models which 
satisfy the following properties: (i) each node is assigned
a binary value specifying its current state, active (damaged
or infected) or inactive (undamaged or susceptible);
(ii) the probability of a node becoming active (in a synchronous
update of all nodes) depends only on its degree
$k = s + 2t$ and the number $m$ of its neighbors who are already
active; this probability is termed the neighborhood
influence response function $F(m,k)$ [IHlISni; (iii) for any
fixed degree $k$, $F(m,k)$ is a nondecreasing function of $m$;
and (iv) once active, a node cannot become deactivated.
The list of processes which satisfy these constraints includes,
but is not necessarily limited to, site and bond
percolation [ST, fHSl , $k$-core decomposition [331 El] , and
Watts' threshold model [6]. Each process is defined by
choosing an appropriate $F(m,k)$, as detailed in [25].

As well as determining the expected cascade size we 
also provide a cascade condition — that is, a condition 
specifying the circumstances under which the number of 
nodes active in the cascade will correspond to a nonvanishing
fraction of the total number of nodes in the
network $n$ (in the limit $n \to \infty$). The dependence of
such a condition on the prescribed level of clustering has 
been the topic of much recent discussion [531 [55H57] . The 
main question under consideration is: "Does the presence 
of clustering in $p_{st}$ networks increase or decrease the expected
cascade size relative to its value in a nonclustered
network with the same degree distribution?" We provide 
a general criterion to answer this question. 

We restrict our attention throughout to cascades on 
undirected networks; however, in theory our method 
should be extendable to directed networks [38]. We also 
note that while the generating function method of [21] 
has the added advantage over our approach that it can 
be used to calculate the entire distribution of cascade 
sizes, such an approach is not directly generalizable to 
the wider class of cascade processes considered here. 

The remainder of this paper is structured as follows. 
In Sec. II we describe our generalized approach to cascade
dynamics on Newman's clustered random networks.
An analytical expression for the mean cascade size and
the cascade condition are derived. We define these in
terms of an arbitrary response function $F(m,k)$. The
particular forms which these results take for various processes
are given in Sec. III, where we discuss in detail the
site percolation problem and Watts' threshold model [6].
We investigate the relationship between clustering and
the cascade condition in Sec. IV.



II. CASCADE PROPAGATION 

Our task here is to show how the theory developed in 
[24, 25] for cascades on locally tree-like networks can be
modified such that it is applicable to the class of clustered
random networks introduced in [21].

Let us begin by recalling some of the properties of that 
class. First, each network realization is defined by a joint 
distribution $p_{st}$ specifying the fraction of nodes connected
to s single edges and t triangles. The conventional degree 
of each node is, therefore, k = s + 2t and the degree 
distribution is 

$$ p_k = \sum_{s,t=0}^{\infty} p_{st}\,\delta_{k,s+2t}, \tag{1} $$

where $\delta_{ij}$ is the Kronecker delta. Second, the clustering
coefficient $C$, following the definition given in [53], is

$$ C = \frac{3 \times (\text{number of triangles in network})}{(\text{number of connected triples})} = \frac{3N_\triangle}{N_3}, \tag{2} $$

where $3N_\triangle = n \sum_{s,t} t\,p_{st}$ and $N_3 = n \sum_k \binom{k}{2} p_k$. Notice
that upon substitution into Eq. (2) the factors of $n$ cancel,
allowing $C$ to remain nonzero even as $n \to \infty$.
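As an illustration of Eqs. (1) and (2), the following minimal Python sketch computes the degree distribution and the clustering coefficient from a joint distribution. The dictionary representation of $p_{st}$, keyed by `(s, t)`, and the function names are our own assumptions for the purpose of the example; they do not come from the references.

```python
from math import comb

def degree_distribution(p_st):
    """Eq. (1): p_k = sum over (s, t) of p_st * delta_{k, s+2t}."""
    p_k = {}
    for (s, t), p in p_st.items():
        k = s + 2 * t
        p_k[k] = p_k.get(k, 0.0) + p
    return p_k

def clustering_coefficient(p_st):
    """Eq. (2) with the common factor n cancelled: 3 N_triangle / N_3."""
    triangles_x3 = sum(t * p for (s, t), p in p_st.items())
    p_k = degree_distribution(p_st)
    triples = sum(comb(k, 2) * p for k, p in p_k.items())
    return triangles_x3 / triples if triples > 0 else 0.0

# Hypothetical example: half of the nodes have three single edges,
# the other half have one single edge and one triangle (degree 3 in both cases).
p_st = {(3, 0): 0.5, (1, 1): 0.5}
p_k = degree_distribution(p_st)
C = clustering_coefficient(p_st)
```

In this example every node has degree 3, so `p_k` has a single entry, while `C` is set entirely by the fraction of nodes that sit on a triangle.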

Now, turning to the theoretical analysis presented in 
[24, 25], we see that this was built entirely on the fact that
the networks being considered were nonclustered, and so
could each be well approximated by a tree in which connections
extended strictly from level to level starting from
an arbitrary root node. This then allowed the propaga- 
tion of a cascade to be modeled as a consecutive sequence 
of activations from a random child node on one level to 
its parent node on the next highest level. From a seed 
fraction of active nodes, the expected size of the ensuing 
cascade was found by iterating a simple recurrence rela- 
tion to convergence and then calculating the probability 
of activation of the root node (see Eqs. (1)-(3) of [24]).

If we are to extend this approach to $p_{st}$ networks we
must first justify the use of the tree approximation in the
presence of nonzero clustering. Observe, however, that
in these networks clustering is generated solely through
the motif of nonoverlapping triangles. Fitting this specific
type of clustering into the tree-based framework is
straightforward: a triangle exists whenever an edge connects
two nodes on the same level. Therefore, in terms
of dynamics, the only difference from the nonclustered
networks dealt with in [24, 25] is that now we are faced
with two distinct ways in which activations may propagate
from one level to the next; see Fig. 1. They may
spread as in Fig. 1(a) from a child (c) to its parent (p)
across a single edge or as in Fig. 1(b) from either child
at the base of a triangle to the parent at its apex.

FIG. 1: Level-by-level cascade propagation in a $p_{st}$ network
using the tree approximation: (a) across a single edge; (b) across
a triangle. Triangle corners are marked in black.



A. Expected cascade size 

Following the methodology of [24, 25] then, let us
model a generalized cascade as a recursive sequence of 
activations from child to parent and set up self-consistent 
equations for the probabilities involved. 

Considering first Fig. 1(a), let $\sigma_1$ be the probability
that the child is active conditional on its parent being
inactive, and let $\sigma_0 = 1 - \sigma_1$ be the corresponding conditional
probability that the child is inactive. For convenience
we represent this set of probabilities with the generating
function $\sigma(x) = \sigma_0 + \sigma_1 x$. Similarly, in Fig. 1(b),
let $\tau_2$ be the probability that both children are active,
conditional on their parent being inactive, let $\tau_1$ be the
conditional probability that only one child is active, and
let $\tau_0 = 1 - \tau_1 - \tau_2$ be the conditional probability that
neither child is active. The generating function for these
probabilities is $\tau(x) = \tau_0 + \tau_1 x + \tau_2 x^2$.

Of course, the node arrangements represented by
Figs. 1(a)-(b) usually exist in various combinations, and
not exclusively of each other. By definition, in any given
network realization a randomly chosen node will be directly
connected to $s$ nodes via single edges and to $2t$
nodes via triangle edges, with probability $p_{st}$. Therefore,
letting $\Pi_m^{st}$ be the probability that $m$ of these neighboring
nodes are active, $\sigma(x)$ and $\tau(x)$ are related to that
probability by the generating function

$$ G(x) = \sum_{m=0}^{s+2t} \Pi_m^{st} x^m = [\sigma(x)]^s [\tau(x)]^t, \tag{3} $$

defined for each pairing of $s$ and $t$.
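The coefficients $\Pi_m^{st}$ of Eq. (3) can be obtained numerically by multiplying out the polynomials. The sketch below does this with plain coefficient lists; the helper names `poly_mul` and `neighbor_activity_pmf` are hypothetical, chosen only for this illustration.

```python
def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (lowest order first)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def neighbor_activity_pmf(s, t, sigma1, tau1, tau2):
    """Coefficients Pi_m (m = 0..s+2t) of [sigma(x)]^s [tau(x)]^t, Eq. (3)."""
    sigma = [1.0 - sigma1, sigma1]            # sigma_0 + sigma_1 x
    tau = [1.0 - tau1 - tau2, tau1, tau2]     # tau_0 + tau_1 x + tau_2 x^2
    coeffs = [1.0]
    for _ in range(s):
        coeffs = poly_mul(coeffs, sigma)
    for _ in range(t):
        coeffs = poly_mul(coeffs, tau)
    return coeffs

# Hypothetical values: one single edge and one triangle.
pmf = neighbor_activity_pmf(s=1, t=1, sigma1=0.5, tau1=0.2, tau2=0.1)
```

Since $\sigma(1) = \tau(1) = 1$, the returned coefficients always sum to one, which is a convenient sanity check.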

We are now in a position to write down an analytical
expression for $\sigma_1$. In terms of an arbitrary response
function $F(m, s + 2t)$, written $F_m$ for short, we have

$$ \sigma_1 = \rho_0 + (1 - \rho_0) \sum_{s,t=0}^{\infty} \frac{s\,p_{st}}{\langle s \rangle} \sum_{m=0}^{s+2t-1} \Pi_m^{s-1,t} F_m, \tag{4} $$

where $\rho_0$ is the seed fraction, $\langle s \rangle = \sum_{s,t} s\,p_{st}$ is the
average number of single edges per node, and $\Pi_m^{s-1,t}$ denotes the
coefficients of Eq. (3) for a node with $s - 1$ single edges and $t$
triangles leading to its children. Eq. (4) is a
self-consistent equation for $\sigma_1$ since, according to Eq. (3),
$\Pi_m^{s-1,t}$ is itself a function of the coefficients of $\sigma(x)$ and
$\tau(x)$. We can read Eq. (4) as follows: the probability
of the child node in a randomly chosen single edge pair
being active, conditional on its parent being inactive, is
equal to the probability that it was either initially active
($\rho_0$), or that it was not ($1 - \rho_0$) and subsequently became active by
copying the behavior of the $m$ out of $s + 2t - 1$ of its own
children that were already active. Note that the term $s\,p_{st}/\langle s \rangle$
is the probability of reaching a child with $s$ single edges
by traveling along a random single edge from its parent
(see [21]).

To obtain similar expressions for $\tau_1$ and $\tau_2$ we must
reflect the fact that in a triangle the state of either
child may influence the state of the other. Referring to
Fig. 1(b), the probability that one child is active regardless
of the state of the other is

$$ \alpha = \rho_0 + (1 - \rho_0) \sum_{s,t=0}^{\infty} \frac{t\,p_{st}}{\langle t \rangle} \sum_{m=0}^{s+2(t-1)} \Pi_m^{s,t-1} F_m, \tag{5} $$

the probability that one child is inactive if the other is
inactive but will activate if the other is active is

$$ \beta = (1 - \rho_0) \sum_{s,t=0}^{\infty} \frac{t\,p_{st}}{\langle t \rangle} \sum_{m=0}^{s+2(t-1)} \Pi_m^{s,t-1} \left[ F_{m+1} - F_m \right], \tag{6} $$

and finally the probability that one child is inactive even
if the other is active is $\gamma = 1 - \alpha - \beta$. In Eqs. (5)-(6),
we use the fact that following a triangle edge from the
parent leads to a child with $t$ triangles with probability
$t\,p_{st}/\langle t \rangle$. This child then has $s$ single edges and $t - 1$
triangles available to connect to its own children, giving
its maximum number of active children (for the sum over
$m$) as $s + 2(t - 1)$. Expressed in terms of the probabilities
$\alpha$ and $\beta$, self-consistent expressions for $\tau_1$ and $\tau_2$ are given
by



$$ \tau_1 = 2\alpha\gamma, \tag{7} $$

and

$$ \tau_2 = \alpha^2 + 2\alpha\beta. \tag{8} $$



The form of Eq. (7) arises from the fact that the probability
of the parent in a triangle of nodes having exactly one
active child is equal to the probability that one child is
active regardless of the state of the other ($\alpha$), while the
other is inactive regardless of the state of the first ($\gamma$),
and there are two different ways in which this may be
the case. Reading Eq. (8) in the same way, we see that
the probability of the parent node in a triangle having
two active children is equal to the probability that both
children are active regardless of each other's states ($\alpha^2$)
plus the probability that one child is active ($\alpha$) and the other
activates because of this ($\beta$); again there
are two ways in which the latter may occur.

The propagation of a cascade through a $p_{st}$ network is
now almost fully defined. Given a seed fraction $\rho_0$, we
solve Eqs. (3)-(8) to find the steady-state values of the
coefficients of the polynomials $\sigma(x)$ and $\tau(x)$, and then,
using these, we determine the expected cascade size by
calculating the probability of activation of the root node.
This final probability is given by

$$ \rho = \rho_0 + (1 - \rho_0) \sum_{s,t=0}^{\infty} p_{st} \sum_{m=0}^{s+2t} \Pi_m^{st} F_m. \tag{9} $$

Comparing this equation to Eq. (4) we see that here the
root node, which has $s$ single edges and $t$ triangles with
probability $p_{st}$, has no parent and so has $s + 2t$ children.
In Sec. IV we show that the analytical approach derived
here is in excellent agreement with the results of
numerical simulations on $p_{st}$ networks.
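The procedure of solving Eqs. (3)-(8) and then evaluating Eq. (9) can be sketched as a fixed-point iteration. The code below is a minimal illustration under our own assumptions, not a reference implementation: $p_{st}$ is a dictionary keyed by `(s, t)`, the response function is a Python callable `F(m, k)`, and the iteration is started from the state in which children are active only if they are seeds. All function names are hypothetical.

```python
def poly_mul(a, b):
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def pmf(s, t, sigma1, tau1, tau2):
    """Coefficients of [sigma(x)]^s [tau(x)]^t, Eq. (3)."""
    coeffs = [1.0]
    for _ in range(s):
        coeffs = poly_mul(coeffs, [1.0 - sigma1, sigma1])
    for _ in range(t):
        coeffs = poly_mul(coeffs, [1.0 - tau1 - tau2, tau1, tau2])
    return coeffs

def expected_cascade_size(p_st, F, rho0, tol=1e-12, max_iter=100000):
    mean_s = sum(s * p for (s, t), p in p_st.items())
    mean_t = sum(t * p for (s, t), p in p_st.items())
    # Assumed starting point: a child is active only if it is a seed.
    sigma1, tau1, tau2 = rho0, 2 * rho0 * (1 - rho0), rho0 ** 2
    for _ in range(max_iter):
        new_sigma1, alpha, beta = rho0, rho0, 0.0
        for (s, t), p in p_st.items():
            if p == 0:
                continue
            k = s + 2 * t
            if s > 0:  # child reached along a single edge, Eq. (4)
                pi = pmf(s - 1, t, sigma1, tau1, tau2)
                new_sigma1 += (1 - rho0) * s * p / mean_s * sum(
                    pi[m] * F(m, k) for m in range(len(pi)))
            if t > 0:  # child reached along a triangle edge, Eqs. (5)-(6)
                pi = pmf(s, t - 1, sigma1, tau1, tau2)
                w = (1 - rho0) * t * p / mean_t
                alpha += w * sum(pi[m] * F(m, k) for m in range(len(pi)))
                beta += w * sum(pi[m] * (F(m + 1, k) - F(m, k))
                                for m in range(len(pi)))
        gamma = 1.0 - alpha - beta
        new_tau1 = 2 * alpha * gamma               # Eq. (7)
        new_tau2 = alpha ** 2 + 2 * alpha * beta   # Eq. (8)
        change = max(abs(new_sigma1 - sigma1), abs(new_tau1 - tau1),
                     abs(new_tau2 - tau2))
        sigma1, tau1, tau2 = new_sigma1, new_tau1, new_tau2
        if change < tol:
            break
    rho = rho0                                     # Eq. (9)
    for (s, t), p in p_st.items():
        pi = pmf(s, t, sigma1, tau1, tau2)
        rho += (1 - rho0) * p * sum(pi[m] * F(m, s + 2 * t)
                                    for m in range(len(pi)))
    return rho

def any_active_neighbor(m, k):
    """Hypothetical response: activate as soon as one neighbor is active."""
    return 1.0 if m > 0 else 0.0

rho_triangles = expected_cascade_size({(0, 1): 1.0}, any_active_neighbor, rho0=0.1)
rho_tree = expected_cascade_size({(3, 0): 1.0}, any_active_neighbor, rho0=0.01)
```

The first example is a network of isolated triangles, for which direct counting gives $1 - (1 - \rho_0)^3$ as the active fraction; the iteration reproduces this value. The second is a 3-regular network without triangles, where any nonzero seed eventually activates the whole tree.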



B. Cascade condition 

Having established an analytical expression for the expected
cascade size in Eq. (9), we now turn to the derivation
of a cascade condition. This will determine the circumstances
under which the process of propagating activations
described by Eqs. (3)-(8) can generate a nonvanishing
mean cascade size from an infinitesimally small
seed fraction $\rho_0 \to 0$.

We begin by observing that Eqs. (3)-(8) can be represented
as the steady state of a nonlinear system of
the general form $v^{(n+1)} = H(v^{(n)})$, where $v^{(n)} =
[\sigma_1^{(n)}, \tau_1^{(n)}, \tau_2^{(n)}]$. The trivial solution $v = 0$ corresponds
to an equilibrium state where cascades do not
occur. We can look for other solutions by applying a
small perturbation away from this equilibrium and then
considering the trajectories in a linearized version of the
system.

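Applying this method we first linearize the generating
function $G(x)$ of Eq. (3) about $v = 0$ using a small parameter
$\epsilon$ to measure the magnitude of the perturbation.
Scaling the coefficients of $\sigma(x)$ and $\tau(x)$ as $O(\epsilon)$, that is
$\sigma_1 \sim \epsilon\hat{\sigma}_1$, $\tau_1 \sim \epsilon\hat{\tau}_1$ and $\tau_2 \sim \epsilon\hat{\tau}_2$, we expand $G(x)$ as

$$ G(x) \sim 1 - \epsilon\left[ s\hat{\sigma}_1 + t(\hat{\tau}_1 + \hat{\tau}_2) - (s\hat{\sigma}_1 + t\hat{\tau}_1)x - t\hat{\tau}_2 x^2 \right], \tag{10} $$

up to terms of $O(\epsilon^2)$.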
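The expansion in Eq. (10) can be checked by writing each factor of Eq. (3) in terms of the scaled coefficients and keeping first-order terms. The intermediate steps below are our own spelling-out of that calculation and use only the definitions of $\sigma(x)$ and $\tau(x)$ given earlier.

```latex
\begin{align*}
\sigma(x) &= 1 - \sigma_1 + \sigma_1 x
           = 1 - \epsilon\hat{\sigma}_1 (1 - x), \\
\tau(x)   &= 1 - \tau_1 - \tau_2 + \tau_1 x + \tau_2 x^2
           = 1 - \epsilon\left[ \hat{\tau}_1 (1 - x) + \hat{\tau}_2 (1 - x^2) \right], \\
[\sigma(x)]^s [\tau(x)]^t
          &= 1 - \epsilon\left[ s\hat{\sigma}_1 (1 - x) + t\hat{\tau}_1 (1 - x)
             + t\hat{\tau}_2 (1 - x^2) \right] + O(\epsilon^2).
\end{align*}
```

Collecting powers of $x$ in the last line gives, to first order, $\Pi_1^{st} \approx \epsilon(s\hat{\sigma}_1 + t\hat{\tau}_1)$ and $\Pi_2^{st} \approx \epsilon\,t\hat{\tau}_2$, while all $\Pi_m^{st}$ with $m > 2$ are of higher order.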

Our next step will be to substitute the coefficients of
$G(x)$ from Eq. (10) into Eqs. (4)-(8). Before doing this,
however, we further simplify our analysis by assuming
$F_0 = 0$. This implies that a node will never activate if
none of its neighbors are active, and this is true, or a good
approximation, in many cases of interest. With $F_0 = 0$
then, said substitution gives us a linear system which
may be represented in the matrix form $v^{(n+1)} = A \cdot v^{(n)}$,
where

$$ v^{(n)} = \begin{bmatrix} \sigma_1^{(n)} \\ \tau_2^{(n)} \end{bmatrix}, \qquad A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}, \tag{11} $$



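and the elements of $A$ are

$$ \begin{aligned}
A_{11} &= \frac{\langle (s^2 - s)F_1 \rangle}{\langle s \rangle}, &
A_{12} &= \frac{\langle stF_2 \rangle}{\langle s \rangle} + \frac{\langle stF_1 \rangle \left( \langle t \rangle - \langle tF_1 \rangle \right)}{\langle s \rangle \langle tF_1 \rangle}, \\
A_{21} &= \frac{2\langle stF_1 \rangle \langle tF_1 \rangle}{\langle t \rangle^2}, &
A_{22} &= \frac{2\langle (t^2 - t)F_1 \rangle}{\langle t \rangle} + \frac{2\langle (t^2 - t)(F_2 - F_1) \rangle \langle tF_1 \rangle}{\langle t \rangle^2},
\end{aligned} \tag{12} $$

where angle brackets denote averages over $p_{st}$, for example
$\langle tF_1 \rangle = \sum_{s,t} t\,F(1, s + 2t)\,p_{st}$.
Note, the application of Eq. (10) has allowed us to express
$\tau_1^{(n)}$ in terms of $\tau_2^{(n)}$ as $\left( \langle t \rangle - \langle tF_1 \rangle \right)\tau_2^{(n)} / \langle tF_1 \rangle$,
hence the reduction to the $2 \times 2$ system of linear equations
represented by Eqs. (11)-(12).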



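In order for this system to produce trajectories which
will diverge from $v = 0$, in other words in order to produce
cascades, we require that the larger eigenvalue of $A$
(both eigenvalues are real) be greater than one, $\lambda_+ > 1$
[53]. This condition is satisfied if

$$ \begin{aligned}
&\langle t \rangle \left[ 2\langle stF_1 \rangle^2 - \left( \langle (s^2 - s)F_1 \rangle - \langle s \rangle \right)\left( 2\langle (t^2 - t)F_1 \rangle - \langle t \rangle \right) \right] \\
&\quad - 2\langle tF_1 \rangle \left[ \left( \langle (s^2 - s)F_1 \rangle - \langle s \rangle \right) \langle (t^2 - t)(F_2 - F_1) \rangle - \langle stF_1 \rangle \langle st(F_2 - F_1) \rangle \right] > 0.
\end{aligned} \tag{13} $$

Conversely, if the left hand side of Eq. (13) is negative
then $\lambda_+ < 1$, and the trivial equilibrium is stable, so
cascades do not occur. The boundary between these two
regimes, one where cascades are observed and the other
where they are not, is located precisely at the point where
$\lambda_+ = 1$, or equivalently where the expression on the left
hand side of Eq. (13) is equal to zero.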
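To make the cascade condition concrete, the following sketch evaluates the left hand side of Eq. (13) and the larger eigenvalue of the matrix in Eqs. (11)-(12) for a given joint distribution and response function. It is an illustration under our own assumptions: the dictionary form of $p_{st}$, the callable `F(m, k)` and all function names are hypothetical, and the eigenvalue routine requires $\langle s \rangle$, $\langle t \rangle$ and $\langle tF_1 \rangle$ to be nonzero.

```python
from math import sqrt

def moments(p_st, F):
    """Averages over p_st that appear in Eqs. (12)-(13)."""
    def avg(g):
        return sum(g(s, t) * p for (s, t), p in p_st.items())
    F1 = lambda s, t: F(1, s + 2 * t)
    F2 = lambda s, t: F(2, s + 2 * t)
    return {
        "s": avg(lambda s, t: s),
        "t": avg(lambda s, t: t),
        "ssF1": avg(lambda s, t: (s * s - s) * F1(s, t)),
        "ttF1": avg(lambda s, t: (t * t - t) * F1(s, t)),
        "stF1": avg(lambda s, t: s * t * F1(s, t)),
        "stF2": avg(lambda s, t: s * t * F2(s, t)),
        "tF1": avg(lambda s, t: t * F1(s, t)),
        "ttdF": avg(lambda s, t: (t * t - t) * (F2(s, t) - F1(s, t))),
        "stdF": avg(lambda s, t: s * t * (F2(s, t) - F1(s, t))),
    }

def cascade_condition_lhs(p_st, F):
    """Left hand side of Eq. (13); positive values indicate cascades."""
    m = moments(p_st, F)
    x = m["ssF1"] - m["s"]
    return (m["t"] * (2 * m["stF1"] ** 2 - x * (2 * m["ttF1"] - m["t"]))
            - 2 * m["tF1"] * (x * m["ttdF"] - m["stF1"] * m["stdF"]))

def largest_eigenvalue(p_st, F):
    """lambda_+ of the 2x2 matrix A; needs <s>, <t>, <tF_1> > 0."""
    m = moments(p_st, F)
    a11 = m["ssF1"] / m["s"]
    a12 = m["stF2"] / m["s"] + m["stF1"] * (m["t"] - m["tF1"]) / (m["s"] * m["tF1"])
    a21 = 2 * m["stF1"] * m["tF1"] / m["t"] ** 2
    a22 = 2 * m["ttF1"] / m["t"] + 2 * m["ttdF"] * m["tF1"] / m["t"] ** 2
    trace, det = a11 + a22, a11 * a22 - a12 * a21
    return 0.5 * (trace + sqrt(trace * trace - 4 * det))

def example_response(m, k):
    """Hypothetical response with F_0 = 0, F_1 = 0.4, F_m = 0.6 for m >= 2."""
    return {0: 0.0, 1: 0.4}.get(m, 0.6)

# Hypothetical 4-regular network in which half of the nodes carry one triangle.
p_st = {(4, 0): 0.5, (2, 1): 0.5}
lhs = cascade_condition_lhs(p_st, example_response)
lam = largest_eigenvalue(p_st, example_response)
```

For this example both diagnostics agree: the left hand side of Eq. (13) is positive and the larger eigenvalue exceeds one.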



III. RESPONSE FUNCTIONS 

In this section we will show how the generalized theory
of Sec. II may be used to model a range of processes
on $p_{st}$ networks. As stated in the introduction, each specific
process will be defined by choosing an appropriate
response function, and Eqs. (3)-(9) will then give the expected
cascade size. We consider the examples of site
percolation and Watts' threshold model [6] in detail.



A. Site and bond percolation 

The resilience of random networks in the face of in- 
discriminate breakdowns or coordinated attacks is a key 
concern across multiple disciplines from epidemiology to 
telecommunications. Modeling these types of events as 
percolating processes has proved to be very fruitful, allowing
theorists to uncover formulas for, amongst other
things, the size distribution of connected components
[39, 40] and epidemic thresholds [41]. The two most basic
models studied are uniform site percolation and uniform
bond percolation. Here we show that both models may be
considered as special cases of our generalized approach,
corresponding to suitable choices for the response function
$F(m, s + 2t)$.

Following the approach of [42], we frame our description
in the language of successive activations already introduced.
We define a node as active if it is part of the
giant connected component (GCC) of the network, and
our choice of response function, Eq. (14) or Eq. (18) below,
determines the type of percolation under consideration,
either site percolation or bond percolation respectively.
When this activation process reaches steady state,
all nodes which are labeled as active have at least one
active neighbor to which they are connected. Thus the
fraction $\rho$ of active nodes equals the size of the connected
component, expressed as a fraction of the network size $n$.
In the $n \to \infty$ limit, only the giant connected component
size scales with $n$, and so $\rho$ gives the fractional size of
the GCC. This can be seen also from the fact that in
the limit of zero clustering, our equations reduce to the
standard percolation equations for GCC size in configuration
model networks, as given in [39]. This method
does not permit the calculation of finite-size connected 
components (see 

In uniform site percolation, each node is occupied with
independent probability $\mu$ and an occupied node can become
active in the cascade, i.e. form part of the giant
connected component (GCC), if it has one or more active
neighbors (who are already in the GCC). Unoccupied
nodes can never become active. The response function
for site percolation is therefore [25],

$$ F(m, s + 2t) = \begin{cases} 0 & \text{if } m = 0, \\ \mu & \text{if } m > 0. \end{cases} \tag{14} $$
Using Eq. (14) in the $\rho_0 \to 0$ limit of Eqs. (3)-(9), and
noting that with this choice of response function

$$ \sum_{m=0}^{s+2t} \Pi_m^{st} F(m, s + 2t) = \mu \left( 1 - \sigma_0^s \tau_0^t \right), \tag{15} $$

the expected size of the GCC (as $n \to \infty$) is given by
Eq. (9), and reduces to the simple form

$$ \rho = \mu \sum_{s,t=0}^{\infty} p_{st} \left( 1 - \sigma_0^s \tau_0^t \right). \tag{16} $$

Substituting Eq. (14) into our cascade condition Eq. (13)
we derive the following equation for the critical site percolation
occupation probability $\mu$:

$$ \left( \mu \langle s^2 - s \rangle - \langle s \rangle \right)\left( 2\mu \langle t^2 - t \rangle - \langle t \rangle \right) - 2\mu^2 \langle st \rangle^2 = 0, \tag{17} $$

which, with $\mu = 1$, is in agreement with Eq. (22) of [21].
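Eq. (17) is a quadratic in $\mu$ and can be solved directly once the moments of $p_{st}$ are known. The sketch below is one possible way to do this; the function name is hypothetical, the choice of the smallest positive root as the threshold is our assumption, and so is the handling of the degenerate cases in which the network contains no triangles or no single edges (where Eq. (17) is satisfied trivially and the corresponding single factor is used instead).

```python
from math import sqrt

def critical_site_occupation(p_st):
    """Smallest positive solution of Eq. (17), or None if there is none."""
    def avg(g):
        return sum(g(s, t) * p for (s, t), p in p_st.items())
    a = avg(lambda s, t: s * s - s)   # <s^2 - s>
    b = avg(lambda s, t: s)           # <s>
    c = avg(lambda s, t: t * t - t)   # <t^2 - t>
    d = avg(lambda s, t: t)           # <t>
    e = avg(lambda s, t: s * t)       # <st>
    if d == 0:   # assumed handling: no triangles, use mu <s^2 - s> = <s>
        return b / a if a > 0 else None
    if b == 0:   # assumed handling: no single edges, use 2 mu <t^2 - t> = <t>
        return d / (2 * c) if c > 0 else None
    # (a mu - b)(2 c mu - d) - 2 e^2 mu^2 = q2 mu^2 + q1 mu + q0
    q2, q1, q0 = 2 * (a * c - e * e), -(a * d + 2 * b * c), b * d
    if q2 == 0:
        roots = [-q0 / q1] if q1 != 0 else []
    else:
        disc = q1 * q1 - 4 * q2 * q0
        if disc < 0:
            return None
        roots = [(-q1 - sqrt(disc)) / (2 * q2), (-q1 + sqrt(disc)) / (2 * q2)]
    positive = [r for r in roots if r > 0]
    return min(positive) if positive else None

# Hypothetical examples: a 3-regular network in which every node has one
# single edge and one triangle, and a 3-regular network without triangles.
mu_clustered = critical_site_occupation({(1, 1): 1.0})
mu_tree = critical_site_occupation({(3, 0): 1.0})
```

A returned value greater than one would indicate that no giant connected component exists for any admissible occupation probability.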

In uniform bond percolation each edge is occupied with
independent probability $\nu$ and a node can become active
only if it is linked to another active node by an occupied
edge. Thus, a node with $m$ active children has probability
$1 - (1 - \nu)^m$ of becoming active itself. The appropriate
choice of response function in this case is therefore [25],

$$ F(m, s + 2t) = \begin{cases} 0 & \text{if } m = 0, \\ 1 - (1 - \nu)^m & \text{if } m > 0. \end{cases} \tag{18} $$
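The two percolation response functions, Eqs. (14) and (18), translate directly into code. The following sketch writes them as small factory functions returning a callable `F(m, k)`; the factory names are hypothetical and chosen only for this illustration.

```python
def site_percolation_response(mu):
    """Eq. (14): an occupied node (probability mu) activates if m > 0."""
    def F(m, k):
        return mu if m > 0 else 0.0
    return F

def bond_percolation_response(nu):
    """Eq. (18): at least one of m edges to active neighbors is occupied."""
    def F(m, k):
        return 1.0 - (1.0 - nu) ** m
    return F

F_site = site_percolation_response(0.6)
F_bond = bond_percolation_response(0.5)
```

Both functions ignore the degree argument `k`, are nondecreasing in `m`, and return zero when `m = 0`, as required by the assumptions of Sec. II.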



The approach outlined here is also applicable to
two other closely related problems: susceptible-infected-recovered
(SIR) disease transmission [11] HJ] and $k$-core
decomposition [331 IM]. In fact, it was shown in [^ that
in the steady state the infected fraction in SIR may be
mapped directly to the bond percolation problem. The
latter problem ($k$-core decomposition) was discussed in detail in [25] and the relevant
response function for standard configuration model networks
was provided (see Eq. (10) of [25]). With the introduction
of triangles we simply update that response
function $F(m, k)$ by setting $k = s + 2t$ and continue as
above.

Regarding epidemiological studies, the question of how 
clustering in networks of human interactions may influ- 
ence the size and persistence of outbreaks of infectious 
diseases has been the topic of much recent discussion 
[45-50]. In fact, much of the impetus for considering
more complex topological motifs in studies involving networked
structures in general has come from this source
HUdSl. We will see in Sec. IV how the results obtained
by us for site and bond percolation echo (albeit indirectly)
a number of recent results from this literature
concerning the effects of clustering.



B. Watts' model 

In [6] Watts introduced a model of threshold dynamics 
on networks as a simple but plausible mechanism for how 
phenomena such as fads or rumors propagate in society. 



The most basic formulation of this model is as follows. 
In an undirected network of arbitrary degree distribution
$p_k$, assign to each node a random (frozen) threshold $r$
drawn from a specified distribution. Then, starting from
a small seed fraction of active nodes, $\rho_0$, synchronously
update the state of each node based on the following decision
rule: a node will become active if the fraction of its
neighbors which are already active exceeds $r$, otherwise
it will remain inactive. (We also stipulate that once active
a node cannot deactivate.) Repeating this updating
process until a steady state is reached, we call the final
fraction of active nodes the cascade size.

In [25] Gleeson defined the response function for Watts'
model in the context of a generalized approach to cascades
on $p_k$ networks (see Eq. (2) of [25]). We can extend
this definition to $p_{st}$ networks simply by setting
$k = s + 2t$. From Eq. (2) of [25] this gives us

$$ F(m, s + 2t) = C_r\!\left( \frac{m}{s + 2t} \right), \tag{19} $$

where $m$ is the number of active neighbors and $C_r$ denotes
the cumulative distribution function (cdf) of the
thresholds. If, for example, we require a Gaussian threshold
distribution with mean $R$ and standard deviation $\sigma$,
then Eq. (19) becomes

$$ F(m, s + 2t) = \frac{1}{2} \left[ 1 + \operatorname{erf}\!\left( \frac{m/(s + 2t) - R}{\sigma\sqrt{2}} \right) \right], \tag{20} $$

where $\operatorname{erf}(x)$ is the error function. Note that $F(0, s + 2t) > 0$
here, meaning some nodes have negative thresholds,
and so will activate even if none of their neighbors are
active. It is possible, therefore, for such nodes to instigate
a cascade even when $\rho_0 = 0$.

In a similar manner to before we obtain the mean
cascade size and the cascade condition by substituting
Eq. (19) into the relevant equations from Sec. II.
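For the Gaussian case, the response function of Eq. (20) can be written in a few lines using the error function from the Python standard library. The factory name `watts_gaussian_response` is hypothetical, and returning zero for an isolated node ($k = 0$) is our own convention rather than something specified in the text.

```python
from math import erf, sqrt

def watts_gaussian_response(R, sigma):
    """Eq. (20): Gaussian threshold cdf evaluated at the active fraction m/k."""
    def F(m, k):
        if k == 0:
            return 0.0  # assumed convention for isolated nodes
        return 0.5 * (1.0 + erf((m / k - R) / (sigma * sqrt(2.0))))
    return F

# Hypothetical parameters: mean threshold 0.25, standard deviation 0.1.
F_watts = watts_gaussian_response(R=0.25, sigma=0.1)
values = [F_watts(m, 4) for m in range(5)]
```

The list `values` is nondecreasing in `m`, and its first entry is small but strictly positive, which is the source of the spontaneous activations noted above.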



IV. EFFECTS OF CLUSTERING ON 
CASCADES 

We now turn to the investigation of how clustering can
affect cascade dynamics on $p_{st}$ networks. This requires
first that we make an appropriate choice for the form
of the joint distribution $p_{st}$. Considering the question
stated in the introduction, "Does the presence of clustering
in $p_{st}$ networks increase or decrease the expected
cascade size relative to its value in a nonclustered network
with the same degree distribution?", we set

$$ p_{st} = p_k\,\delta_{k,s+2t}\left[ (1 - f)\,\delta_{t,0} + f\,\delta_{t,\lfloor (s+2t)/2 \rfloor} \right], \tag{21} $$

where $f \in [0, 1]$, and $\lfloor \cdot \rfloor$ is the floor function.

Applying this definition, we construct $p_{st}$ from a given
degree distribution such that a fraction $f$ of the nodes
in our network are attached to the maximum possible
number of triangles, $t = \lfloor (s + 2t)/2 \rfloor$, while the remaining
fraction $(1 - f)$ are attached to single edges only ($t = 0$). Upon
substitution of Eq. (21) into Eq. (2) we find that the
clustering coefficient $C$ can be expressed as

$$ C = f\,\frac{\sum_k \lfloor k/2 \rfloor\, p_k}{\sum_k \binom{k}{2} p_k}. \tag{22} $$

This linear relationship between $C$ and $f$ allows us to
vary $C$ continuously from its minimum value at $f = 0$ to
its maximum possible value obtained at $f = 1$, while preserving
$p_k$ throughout. We cannot guarantee, however,
that degree-degree correlations will be preserved [55].
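The construction of Eq. (21) and the resulting clustering coefficient of Eq. (22) can be sketched as follows. The dictionary representations of $p_k$ and $p_{st}$ and the function names are assumptions made for this example only.

```python
from math import comb

def joint_distribution(p_k, f):
    """Eq. (21): a fraction f of degree-k nodes get floor(k/2) triangles,
    the remaining fraction 1 - f get single edges only."""
    p_st = {}
    for k, p in p_k.items():
        t_max = k // 2
        for t, weight in ((0, 1.0 - f), (t_max, f)):
            key = (k - 2 * t, t)
            p_st[key] = p_st.get(key, 0.0) + p * weight
    return p_st

def clustering_from_f(p_k, f):
    """Eq. (22): C is linear in f for a fixed degree distribution."""
    numerator = sum((k // 2) * p for k, p in p_k.items())
    denominator = sum(comb(k, 2) * p for k, p in p_k.items())
    return f * numerator / denominator

# Hypothetical example: a 3-regular degree distribution with f = 0.5.
p_k = {3: 1.0}
p_st = joint_distribution(p_k, 0.5)
C = clustering_from_f(p_k, 0.5)
```

For nodes of degree 0 or 1 the maximum number of triangles is zero, so both branches of Eq. (21) coincide and the node simply keeps its single edges.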

In Fig. 2 we have used Eq. (21) to verify our theory in
the case of site percolation on $p_{st}$ networks with Poisson
degree distribution $p_k = z^k e^{-z}/k!$. We plot our result for
the GCC size from Eq. (16) against numerical simulations
for two different values of the mean degree $z = \sum_k k\,p_k$.
In both cases we consider minimum clustering ($f = 0$)
and maximum clustering ($f = 1$). Threshold values defined
by Eq. (17) are also plotted (see caption for details).

Observing the relative positions of the percolation
thresholds in Fig. 2 (pentagrams) we note that they lend
support to (or, at least, do not contradict) the argument
that adding triangles decreases the cascade size.
We showed in [35] that this is unambiguously the case in
the bond percolation problem on $z$-regular $p_{st}$ networks,
i.e. those with $p_k = \delta_{k,z}$ (all nodes have $z$ neighbors).
However, since adding triangles to a $z$-regular network
cannot affect its correlation structure, this meant that
any effects which may have been introduced by allowing
correlations to vary were automatically negated. Furthermore,
it was explicitly demonstrated in [35], and also
[22], that such effects may significantly complicate matters.
In Fig. 2, on the other hand, degree-degree correlations
are not preserved. Therefore while this figure does
validate the theoretical approach of the preceding sections
it does not permit us to draw definitive conclusions
as regards the question of the change in the expected
cascade size due to clustering alone.

FIG. 2: (Color online) Size of giant connected component $\rho$ as
a function of site occupation probability $\mu$ on $p_{st}$ networks of
10^ nodes with Poisson degree distribution $p_k$ for two different
values of the mean degree, $z = 3$ and $z = 5$. Numerical simulations
averaged over 100 realizations (symbols) versus theory
of Sec. II (solid lines). In both cases we consider minimum
clustering $f = 0$ and maximum clustering $f = 1$. In each of
the four parameter settings we calculate the critical site occupation
probability from Eq. (17) and mark its position on
the $\mu$ axis with a (yellow) pentagram.

In order to do that we will follow the approach of [35]
(see also [37]) and focus our investigation on $p_{st}$ networks
with $z$-regular $p_k$. In particular, we consider the following
joint distribution

$$ p_{st} = \delta_{z,s+2t}\left[ (1 - g)\,\delta_{t,0} + g\,\delta_{t,1} \right], \tag{23} $$

where $z \geq 2$. This choice shares some similarities with
Eq. (21); however, here we are adding only one triangle
to each of a fraction $g$ of the nodes in a $z$-regular network.
Substituting Eq. (23) into Eq. (13) we have, as the condition
for cascades to occur (corresponding to $\lambda_+ > 1$,
see Sec. II B),

$$ F_1(z^2 - z) - z + g S_c > 0, \tag{24} $$

where

$$ S_c = 2 + F_1(6 - 4z) + 2F_1^2(z - 2)^2 + 2F_1^2 F_2 (z - 2)^2 - 2F_1^3 (z - 2)^2 \tag{25} $$

denotes the sum of the terms which introduce clustering
into the network. This expression gives us an insight
into how adding triangles alters the cascade size. Given
a specific $z$ we can determine the qualitative effect of
clustering in the following way. First, set the expression
on the left hand side of Eq. (24) equal to zero and solve
for $F_1$ at $g = 0$. This determines the value of $F_1$ at
the transition to the cascade regime in the nonclustered
network: the well-known result of Watts [6], $F_1 = 1/(z - 1)$.
Next, substitute that $F_1$ into $S_c$ and observe its sign.
If it is negative we conclude that introducing triangles
will decrease the expected cascade size. If, on the other
hand, $S_c$ is positive, adding triangles will increase the
cascade size.
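The two-step recipe just described (fix $F_1 = 1/(z - 1)$, then inspect the sign of $S_c$) is easy to automate. The sketch below does so for Watts' model with Gaussian thresholds; the function names are hypothetical, the mean $R$ is obtained by inverting the Gaussian cdf with `statistics.NormalDist` from the Python standard library, and the routine assumes $z \geq 3$ so that $F_1$ lies strictly between 0 and 1.

```python
from statistics import NormalDist

def clustering_sum(F1, F2, z):
    """S_c of Eq. (25)."""
    return (2 + F1 * (6 - 4 * z) + 2 * F1 ** 2 * (z - 2) ** 2
            + 2 * F1 ** 2 * F2 * (z - 2) ** 2 - 2 * F1 ** 3 * (z - 2) ** 2)

def watts_at_transition(z, sigma=0.1):
    """Return (F1, F2) for Gaussian thresholds with R fitted to F1 = 1/(z-1)."""
    F1 = 1.0 / (z - 1)
    # Choose R such that the threshold cdf at the fraction 1/z equals F1.
    R = 1.0 / z - sigma * NormalDist().inv_cdf(F1)
    thresholds = NormalDist(mu=R, sigma=sigma)
    return F1, thresholds.cdf(2.0 / z)

def watts_clustering_sum(z, sigma=0.1):
    F1, F2 = watts_at_transition(z, sigma)
    return clustering_sum(F1, F2, z)

S_c_z3 = watts_clustering_sum(3)
S_c_z5 = watts_clustering_sum(5)
```

With these settings the sign of `S_c_z3` is negative and the sign of `S_c_z5` is positive, so under this sketch triangles suppress cascades at $z = 3$ and promote them at $z = 5$.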

The justification for these last two statements follows
simply from the fact that if $S_c$ constitutes a negative
contribution to the expression on the left hand side of
Eq. (24) then increasing $g$, given that $F_1 = 1/(z - 1)$,
will break the inequality in Eq. (24) and take us into the
regime where cascades do not occur. Alternatively, if $S_c$
is shown to be positive then increasing the parameter $g$
will ensure the inequality holds and cascades do occur at
these parameter values.

In Fig. 3 we have plotted $S_c$ against $z$ for three of
the processes described in Sec. III: site percolation, bond
percolation and Watts' model. In this last case we have
chosen the following parameters: seed fraction $\rho_0 = 0$,
and a Gaussian threshold distribution with mean $R$ fitted
to $F_1 = 1/(z - 1)$, and standard deviation $\sigma = 0.1$.

FIG. 3: (Color online) Sum of the clustering terms from
Eq. (24), $S_c$, versus mean degree $z$ on $p_{st}$ networks with
$z$-regular degree distribution. Results from site percolation,
bond percolation, and Watts' model are shown. As in Sec. III
each process is defined by choosing an appropriate response
function. For Watts' model the threshold distribution is
Gaussian with standard deviation $\sigma = 0.1$ and mean $R$, such
that $F_1 = 1/(z - 1)$. Note, only integer $z$ values are realizable
as $z$-regular networks.

This plot indicates that adding triangles will decrease
the expected cascade size in both site percolation and
bond percolation. In other words, the occupation probability
needed for a giant connected component to exist
(the percolation threshold) is increased in the presence of
clustering. As mentioned above, we have already demonstrated
in [35] that this is the case for the latter of these
two processes; to our knowledge this is the first statement
of the corresponding result for site percolation. While
these results are not directly applicable to models of the
spread of disease, in light of the established connection
between SIR epidemics and bond percolation we suggest
that they may, nonetheless, be of some interest to researchers
in that field. This statement is vindicated by
the fact that analogous results have recently been established
in a number of epidemiological studies which have
shown that clustering can adversely affect the propagation
of a disease gSmSHSO].

Also of interest is the behavior of $S_c$ for Watts' model.
As $z$ increases in Fig. 3 we see $S_c$ vary from negative
values for $z \leq 3$, through a regime of positivity, and back
again to negative values for $z \geq 29$. This tells us that
for $z \leq 3$ the presence of clustering will decrease the left
hand side of Eq. (24) below zero, thereby decreasing the
expected cascade size; for $3 < z < 29$ clustering will
increase the expected cascade size; and finally for $z \geq 29$
clustering will once more tend to decrease the expected
cascade size. We note that qualitatively similar results
are seen for different values of $\sigma$, the standard deviation
of the thresholds.

By way of validation, in Fig. 4 we plot the cascade size
$\rho$ against the mean of the threshold distribution $R$ for
Watts' model with joint distribution defined by Eq. (21),
and otherwise the same parameter settings as in Fig. 3
(see caption for details). We inferred from Fig. 3 that at
$z = 3$ cascades become smaller as clustering is increased.
This is what we observe in Fig. 4(a). In contrast, at
$z = 5$ cascades should become larger as clustering
increases. This is verified by Fig. 4(b).

FIG. 4: (Color online) Expected size of cascade outbreak
$\rho$ versus mean $R$ of a Gaussian threshold distribution with
$\sigma = 0.1$ for Watts' model on graphs of 10^ nodes with
$z$-regular degree distribution $p_k = \delta_{k,z}$ and joint
distribution $p_{st}$ defined by Eq. (21). Numerical simulations
averaged over 100 realizations (symbols) and theory of
Sec. II (solid lines). (a) $z = 3$: here increasing the level of
clustering decreases the expected cascade size at any given
$R$ value; (b) $z = 5$: increasing the level of clustering
increases the expected cascade size.

This dependence of the cascade size on the sign of the
sum of the clustering terms in Eq. (24), $S_c$, may be
expressed succinctly as a condition on the response function
$F_2$, the probability of activation in the presence of two
active neighbors. Specifically, if the value of $F_2$ at the
transition point for cascades in nonclustered $z$-regular
networks (i.e., $F_2$ evaluated at the parameters for which
$F_1 = 1/(z-1)$) satisfies the condition

$$
F_2 > \frac{2z-3}{(z-2)(z-1)}, \tag{26}
$$



then adding triangles will increase the expected size of
cascades. Alternatively, if $F_2$ does not satisfy this
inequality, clustering will decrease the expected size of
cascades. One may derive this condition by substituting
the zero-clustering cascade condition $F_1 = 1/(z-1)$ into
Eq. (24) and then solving for $F_2$. Note that by substituting
the respective response functions for site and bond
percolation, Eq. (14) and Eq. (18), into Eq. (26) one may
confirm that for $z > 2$ this inequality is not satisfied,
and thus that clustering decreases cascade sizes for both
of these processes (increases the percolation threshold).
Finally, note that Eq. (26) can also be arrived at by a
simple counting argument which compares the spread of
activations in a clustered random network to that in a
nonclustered random network. We leave this discussion
to the Appendix.
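
As a rough numerical illustration of the inequality in Eq. (26), the following Python sketch evaluates $F_2$ at the nonclustered transition point $F_1 = 1/(z-1)$ for the three processes discussed above. The response functions are assumptions, since Eqs. (14) and (18) are not reproduced in this section: $F_m = p$ for $m \ge 1$ (site percolation), $F_m = 1 - (1-p)^m$ (bond percolation), and $F_m = \Phi((m/z - R)/\sigma)$ for Watts' model with Gaussian thresholds, where $\Phi$ is the standard normal distribution function. All function names are hypothetical.

```python
from statistics import NormalDist


def eq26_rhs(z: int) -> float:
    """Right-hand side of Eq. (26); only defined for z > 2."""
    return (2 * z - 3) / ((z - 2) * (z - 1))


def f2_site(z: int) -> float:
    # Assumed site-percolation response: F_m = p for every m >= 1,
    # so F_2 = F_1 = 1/(z - 1) at the nonclustered transition.
    return 1 / (z - 1)


def f2_bond(z: int) -> float:
    # Assumed bond-percolation response: F_m = 1 - (1 - p)^m with p = F_1.
    p = 1 / (z - 1)
    return 1 - (1 - p) ** 2


def f2_watts(z: int, sigma: float = 0.1) -> float:
    # Assumed Watts response: F_m = Phi((m/z - R) / sigma).
    # The mean R is fitted so that F_1 = 1/(z - 1).
    std_normal = NormalDist()
    r_mean = 1 / z - sigma * std_normal.inv_cdf(1 / (z - 1))
    return std_normal.cdf((2 / z - r_mean) / sigma)


def clustering_increases_cascades(f2: float, z: int) -> bool:
    """True when F_2 satisfies the inequality of Eq. (26)."""
    return f2 > eq26_rhs(z)


if __name__ == "__main__":
    # Tabulate the sign of the condition for each process.
    for z in range(3, 41):
        print(
            z,
            clustering_increases_cascades(f2_site(z), z),
            clustering_increases_cascades(f2_bond(z), z),
            clustering_increases_cascades(f2_watts(z), z),
        )
```

Under these assumed response functions, the site and bond columns stay False for every $z$ tested, while the Watts column changes sign with $z$, in line with the behavior of $S_c$ described for Fig. 3.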



CONCLUSIONS

We have shown how the analytical approach to cascade
dynamics on nonclustered configuration model networks
first put forth by Gleeson and Cahalane in ,24 may be
extended to the class of random networks with nonzero
clustering described by Newman in [31].

By adapting the approach of [3S] we have provided a
general analytical expression for the expected size of a
cascade outbreak, and a cascade condition, in these more
realistic network topologies. Through the response
function mechanism both of these results may be applied
to a range of processes including, but not necessarily
limited to, site and bond percolation, $k$-core decomposition,
SIR (susceptible-infected-recovered) disease transmission,
and Watts' threshold model (see also [HT]).

We have also considered the question of how the presence
of clustering qualitatively affects the cascade condition.
This question is complicated by the fact that for
heterogeneous degree distributions, adding triangles will
alter the correlation structure of the network [55, 23].
We have therefore focused our investigation on clustered
networks with $z$-regular degree distributions, in which
degree-correlation effects are absent. This enabled us to
discover a condition on the response function of the
process (see Eq. (26)) which determines the change in the
expected size of the cascade due to clustering alone.

For site and bond percolation we found that clustering
will unambiguously decrease the cascade size, a result
which bears analogy to recent results from the
epidemiological literature concerning the effects of
clustering on disease outbreaks [46, 48–50]. For Watts'
model, however, matters are not so clear-cut. For certain
values of $z$ clustering may increase the mean cascade size,
while for others it will decrease it. The example of this
behavior provided in Fig. 3 corresponds to just one setting
of parameters for Watts' model, namely a Gaussian threshold
distribution with standard deviation $\sigma = 0.1$ and no
seed nodes. We note, however, that further simulations,
the results of which are not provided, have proved these
observations to be robust against changes in $\sigma$. We
believe, therefore, that these observations have significant
implications for studies of the spread of behavior in social
networks, for example [37].

Lastly, we must emphasize that the motif of nonoverlapping
triangles in the model investigated here corresponds
to just one of the many different ways in which
nodes may cluster together in a network. An alternative
model based on the idea of embedding cliques of nodes
within a configuration-type network was developed by
Gleeson [43], while Karrer and Newman have recently
proposed an approach which allows for a much broader
range of clustering motifs than just triangles [52]. The
investigation of some of the questions discussed here
in the context of these models is of significant interest.

Acknowledgments

Appendix: Counting Argument for Condition on $F_2$

Here we give an intuitive argument for the effect of
clustering on cascades in $z$-regular $p_{st}$ networks. This
stands as an alternative derivation of the condition on
$F_2$ in the main body of this paper; see Eq. (26).



We compare the spread of activation from a single node
(colored black in Fig. 5(a) and (b)) to two of its neighbors,
and then further into the network. In configuration
(a) the three nodes considered are not part of a triangle,
and up to $2(z-1)$ second neighbors may potentially
be activated in this way. In configuration (b), the three
nodes form a triangle, and only $2(z-2)$ second neighbors
are available for activation. We proceed to calculate
the expected number of edges which may activate second
neighbors in each configuration, and derive a condition
under which clustering (configuration (b)) gives a larger
number of expected activations than the corresponding
nonclustered case (configuration (a)).

First we consider configuration (a). Each of the two white
nodes will be activated by the black node with probability
$F_1$. If activated, a white node may in turn activate up to
$z-1$ of its other neighbors. So we count the expected number
of active edges (edges which are connected to an active
node) on the right hand side of Fig. 5(a) as $2F_1(z-1)$.

FIG. 5: Spread of activation from a single node (colored black)
in (a) a nonclustered network, and (b) a $p_{st}$ network with
nonzero clustering. Note that the clustered $p_{st}$ network may
also contain single edges which are not part of any triangle;
however, such edges are also present in the nonclustered
network and we are interested only in the differences
introduced by adding triangles.

In configuration (b), the two neighbors of the active
node are also connected to each other, leaving each with
$z-2$ edges to other neighbors. These edges may become
active edges in one of three ways.

(i) Both white nodes are activated directly by their single
active neighbor; this happens with probability
$F_1^2$, and gives $2(z-2)$ active edges on the right
hand side of Fig. 5(b).

(ii) One white node is activated directly by the active
neighbor; the other white node then becomes active
because it now has two active neighbors. This
happens with probability $2F_1(F_2 - F_1)$, and gives
$2(z-2)$ active edges.

(iii) One white node is activated directly by the active
neighbor; the other white node does not activate
even though it has two active neighbors. This happens
with probability $2F_1(1 - F_2)$, and gives $z-2$
active edges.

The expected number of active edges on the right hand
side of Fig. 5(b) is therefore

$$
\begin{aligned}
&2F_1^2(z-2) + 4F_1(F_2 - F_1)(z-2) + 2F_1(1 - F_2)(z-2) \\
&\qquad = 2F_1(z-2)(F_2 - F_1 + 1).
\end{aligned}
\tag{A.1}
$$



This is larger than the value $2F_1(z-1)$ found for
configuration (a) if

$$
F_2 > F_1 + \frac{1}{z-2}. \tag{A.2}
$$
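
The intermediate algebra behind this comparison can be written out as follows. This is a sketch that assumes $F_1 > 0$ and $z > 2$, so that dividing by $2F_1$ and by $z-2$ preserves the direction of the inequality:

$$
\begin{aligned}
2F_1(z-2)(F_2 - F_1 + 1) &> 2F_1(z-1) \\
F_2 - F_1 + 1 &> \frac{z-1}{z-2} = 1 + \frac{1}{z-2} \\
F_2 &> F_1 + \frac{1}{z-2}.
\end{aligned}
$$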



To examine the effect upon the cascade threshold, we
substitute the cascade condition $F_1 = 1/(z-1)$ for the
threshold in a nonclustered $z$-regular network [6] into
Eq. (A.2) to obtain the condition given in Eq. (26). If this
condition is satisfied, cascade propagation is more likely
on the clustered $z$-regular network than on the
nonclustered version.
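
The counting argument above can also be checked with exact rational arithmetic. The Python sketch below sums cases (i) to (iii) for configuration (b) and compares the total with configuration (a); the function names and the sample values of $z$ and $F_2$ are illustrative assumptions, not part of the original analysis.

```python
from fractions import Fraction


def active_edges_nonclustered(f1, z):
    """Configuration (a): two independent neighbors, each with z - 1 onward edges."""
    return 2 * f1 * (z - 1)


def active_edges_clustered(f1, f2, z):
    """Configuration (b): probability times active edges, summed over cases (i)-(iii)."""
    case_i = f1 ** 2 * 2 * (z - 2)               # both neighbors activated directly
    case_ii = 2 * f1 * (f2 - f1) * 2 * (z - 2)   # second neighbor activated via two active neighbors
    case_iii = 2 * f1 * (1 - f2) * (z - 2)       # second neighbor never activates
    return case_i + case_ii + case_iii


def clustering_helps(f1, f2, z):
    """True when the triangle yields more expected active edges than the open pair."""
    return active_edges_clustered(f1, f2, z) > active_edges_nonclustered(f1, z)


if __name__ == "__main__":
    z = 5                                  # illustrative degree
    f1 = Fraction(1, z - 1)                # nonclustered cascade condition
    bound = Fraction(2 * z - 3, (z - 2) * (z - 1))  # right-hand side of Eq. (26)
    for f2 in (Fraction(1, 2), Fraction(9, 10)):    # illustrative F_2 values
        print(f2, f2 > bound, clustering_helps(f1, f2, z))
```

Using `Fraction` avoids floating-point ties at the boundary, so the comparison from the counting argument and the inequality of Eq. (26) can be compared exactly at $F_1 = 1/(z-1)$.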